In this paper we consider the model studied in [17] - a large-scale 
service system with multiple customer classes and multiple server 
pools; interarrival and service times are exponentially distributed, 
and mean service times depend both on the customer class and server 
pool. It is assumed that the allowed activities (routing choices) form a 
tree (in the graph with vertices being both customer classes and server 
pools). We study the behavior of the system under a Leaf Activity 
Priority (LAP) policy, which assigns static priorities to the activities 
in the order of sequential "elimination" of the tree leaves. We consider 
the scaling limit of the system as the arrival rate of customers and 
number of servers in each pool tend to infinity in proportion to a 
scaling parameter r, while the overall system load remains strictly 
subcritical. Indexing the systems by parameter r, it was shown in 
[17] that the family of the invariant distributions is tight on scales 
$r^{1/2+\epsilon}$ for any $\epsilon > 0$. Namely, the sequence of invariant distributions, 
centered at the equilibrium point and scaled down by $r^{-(1/2+\epsilon)}$, is 
tight. 

In this paper we prove a stronger result: the invariant distributions 
are, in fact, tight on the diffusion, i.e. $r^{1/2}$, scale. (This is the strongest 
possible tightness property for the model and the asymptotic regime 
in this paper.) As a consequence, we obtain a limit interchange result: 
the limit of diffusion-scaled invariant distributions is equal to the 
invariant distribution of the limiting diffusion process. 

1. Introduction. Large-scale heterogeneous flexible service systems naturally 
arise as models of large call/contact centers [1, 9], large computer 
farms (used in network cloud data centers), etc. More specifically, in this 
paper we consider a service system with multiple customer and server types 
(or classes), where the arrival rate of class $i$ customers is $\Lambda_i$, the service 
rate of a class $i$ customer by a type $j$ server is $\mu_{ij}$, and the server pool $j$ 
size (the number of type $j$ servers) is $B_j$. It is important that the service 
rate $\mu_{ij}$ in general depends on both the customer type $i$ and server type $j$. 
Customers waiting for service are queued, and they cannot leave the system 
before their service is complete. The system is "large-scale" in the sense that 
the input rates $\Lambda_i$ and pool sizes $B_j$ are large. More precisely, we will 
consider the "many-servers" asymptotic regime, in which the arrival rates $\Lambda_i$ 
and pool sizes $B_j$ scale up to infinity in proportion to a scaling parameter $r$, 
i.e. $\Lambda_i = \lambda_i r$, $B_j = \beta_j r$, while the service rates $\mu_{ij}$ remain constant. 
Furthermore, in this paper we assume that (appropriately defined) system capacity 
exceeds the (appropriately defined) traffic load by an $O(r)$ amount, i.e. the 
system is strictly subcritically loaded. (This is different from the Halfin-Whitt 
"many-servers" regime, in which the capacity exceeds the load by $O(\sqrt{r})$.) 

If under a given control policy the system is stable, i.e., roughly speaking, 
it has a stationary distribution such that the queues are stochastically 
bounded, then the average number of occupied servers in a stationary regime 
is of course $O(r)$. A "good" control policy would keep the steady-state 
system state within $O(\sqrt{r})$ of its equilibrium point, which depends on the system 
parameters and on the policy itself. More precisely, this means that the 
sequence (in $r$) of the system stationary distributions, centered at the equilibrium 
point and scaled down by $r^{-1/2}$, is tight. We will refer to this property as 
$r^{1/2}$-scale, or diffusion-scale, tightness (of invariant distributions). 

It is typically easy to construct a policy ensuring the diffusion-scale 
tightness, if the system parameters $\lambda_i$ and $\mu_{ij}$ are known in advance. (It is natural 
to assume that pool sizes are available to any control policy.) In this case 
the equilibrium point can be computed in advance, and then the appropriate 
fraction of each input flow can be routed to the appropriate server pools. (See 
discussion in [18].) It is much more challenging to establish this property 
for "blind" policies, which do not "know" parameters $\lambda_i$ and $\mu_{ij}$. In fact, 
as shown in [18], under a very natural Largest-Queue, Freest-Server Load 
Balancing (LQFS-LB) algorithm (which is a special case of the QIR policy 
in [10]), the diffusion-scale tightness does not hold in general. LQFS-LB 
assumes that the set of allowed "activities" $(ij)$ (those with $\mu_{ij} > 0$, but not 
the actual $\mu_{ij}$ values) is known and forms a tree in the graph with vertices 
being customer and server types - let us refer to this as the tree assumption; 
otherwise, the LQFS-LB is blind. 

Another example of a blind policy (also with the tree assumption) is the 
Leaf Activity Priority (LAP) algorithm, introduced in [17]. It was shown in 
[17] that LAP ensures $r^{1/2+\epsilon}$-scale tightness of invariant distributions, for 
any $\epsilon > 0$. In this paper we build on and extend the approach of [17], and 
prove the diffusion-scale tightness under LAP. This in turn implies a limit 
interchange property: the limit of (diffusion-scaled) invariant distributions 
is equal to the invariant distribution of the limit (diffusion) process. Proving 
this limit interchange in many-servers regime is very challenging, especially 
for general models with multiple customer and server classes; the reason is 
precisely the difficulty of establishing the diffusion-scale tightness. 

Perhaps more important than establishing the tightness and limit inter- 
change specifically for the LAP policy, is the fact that our technique seems 
quite generic, and may apply to other policies and/or other many-servers 
models. Speaking very informally, the combination of results and proofs in 
[17] and this paper gives technical "blocks" which allow one to establish the 
diffusion-scale tightness as long as the following two properties hold: 

(a) Global stability on the fluid-scale (r-scale), i.e. convergence of fluid-scaled 
trajectories to the equilibrium point (plus an additional, related property); 

(b) Local stability of the linear system in the neighborhood of the equi- 
librium point, i.e. the drift matrix of the limiting diffusion process has all 
eigenvalues with negative real parts. 

We will make this discussion more specific in Section 5. 

1.1. Brief literature review. A general overview of many-servers models, 
results and applications to call centers can be found in [9, 1]. For control 
policies for general models, with multiple customer and server types, 
including blind policies, see e.g. [3, 2, 10, 16, 15, 18, 17] and references 
therein. Overviews of diffusion-scale tightness (and limit interchange) results 
for single-pool models in the many-servers Halfin-Whitt regime can be 
found, e.g., in [7, 6, 8]; the diffusion-scale tightness for the LQFS-LB policy, 
with the tree assumption and additionally assuming that the service rate (if 
non-zero) depends only on the server type, was proven in [18]. 

2. Model. We consider the model studied in [17]. To improve self- 
containment of this paper, we repeat the necessary definitions in this section. 

2.1. The model; Static Planning Problem. Consider the system in which 
there are $I$ customer classes, labeled $1, 2, \ldots, I$, and $J$ server pools, labeled 
$1, 2, \ldots, J$. (Servers within pool $j$ are referred to as class $j$ servers. Also, 
throughout this paper the terms "class" and "type" are used interchangeably.) 
The sets of customer classes and server pools will be denoted by $\mathcal{I}$ 
and $\mathcal{J}$, respectively. We will use the indices $i, i'$ to refer to customer classes, 
and $j, j'$ to refer to server pools. 

We are interested in the scaling properties of the system as it grows large. 
Namely, we consider a sequence of systems indexed by a scaling parameter 
r. As r grows, the arrival rates and the sizes of the service pools, but not 
the speed of service, increase. Specifically, in the $r$th system, customers of 
type $i$ enter the system as a Poisson process of rate $\lambda_i r$, while the $j$th server 
pool has $\beta_j r$ individual servers. (All $\lambda_i$ and $\beta_j$ are positive parameters.) 
Customers may be accepted for service immediately upon arrival, or enter 
a queue; there is a separate queue for each customer type. Customers do 
not abandon the system. When a customer of type $i$ is accepted for service 
by a server in pool $j$, the service time is exponential of rate $\mu_{ij}$; the service 
rate depends both on the customer type and the server type, but not on the 
scaling parameter $r$. If customers of type $i$ cannot be served by servers of 
class $j$, the service rate is $\mu_{ij} = 0$. 

Remark 1. Strictly speaking, the quantity $\beta_j r$ may not be an integer, 
so we should define the number of servers in pool $j$ as, say, $\lfloor \beta_j r \rfloor$. However, 
the change is not substantial, and would only unnecessarily complicate the 
notation. 

Consider the following static planning problem (SPP): 
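(1a) $\min \rho$, 

subject to 

(1b) $\lambda_{ij} \ge 0, \quad \forall i, j$, 

(1c) $\sum_j \lambda_{ij} = \lambda_i, \quad \forall i$, 

(1d) $\sum_i \lambda_{ij}/(\beta_j \mu_{ij}) \le \rho, \quad \forall j$. 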
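As a small illustration of constraints (1b)-(1d), the following Python sketch evaluates the load $\rho$ induced by a given candidate routing; it does not solve the SPP. The function name `spp_load` and the dictionary layout of the inputs are assumptions made for this example only.

```python
def spp_load(routing, lam, beta, mu, tol=1e-9):
    """Return the smallest rho for which `routing` satisfies (1b)-(1d).

    routing[(i, j)] plays the role of lambda_ij; lam[i], beta[j] and
    mu[(i, j)] are the model parameters. All names are illustrative.
    """
    # (1b): routing rates must be non-negative.
    if any(x < -tol for x in routing.values()):
        raise ValueError("negative routing rate")
    # (1c): the rates out of each customer class must add up to lambda_i.
    for i, total in lam.items():
        routed = sum(x for (i2, _), x in routing.items() if i2 == i)
        if abs(routed - total) > tol:
            raise ValueError(f"class {i} is not fully routed")
    # (1d): rho must dominate the load of every pool, so take the maximum.
    return max(
        sum(x / (beta[j] * mu[(i2, j2)])
            for (i2, j2), x in routing.items() if j2 == j)
        for j in beta
    )


# Hypothetical two-class, two-pool example with activity tree (1,1), (2,1), (2,2).
lam = {1: 1.0, 2: 2.5}
beta = {1: 2.0, 2: 2.0}
mu = {(1, 1): 1.0, (2, 1): 1.0, (2, 2): 1.0}
balanced = {(1, 1): 1.0, (2, 1): 0.75, (2, 2): 1.75}
rho = spp_load(balanced, lam, beta, mu)
```

In this hypothetical example both pools carry the same load, so the returned value is below 1 and the instance is strictly subcritical in the sense of Assumption 3.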

Throughout this paper we will always make the following two assumptions 
about the solution to the SPP (1): 

Assumption 2 (Complete resource pooling). The SPP (1) has a unique 
optimal solution $\{\lambda_{ij}, i \in \mathcal{I}, j \in \mathcal{J}\}$, $\rho$. Define the basic activities to be the 
pairs, or edges, $(ij)$ for which $\lambda_{ij} > 0$. Let $\mathcal{E}$ be the set of basic activities. 
We further assume that the unique optimal solution is such that $\mathcal{E}$ forms a 
tree in the (undirected) graph with vertex set $\mathcal{I} \cup \mathcal{J}$. 

Assumption 3 (Strictly subcritical load). The optimal solution to (1) 
has $\rho < 1$. 

Remark 4. Assumption 2 is the complete resource pooling (CRP) con- 
dition, which holds "generically" in a certain sense; see [16, Theorem 2.2]. 
Assumption 3 is essential for the main result of the paper. 
We assume that the basic activity tree is known in advance, and restrict 
our attention to the basic activities only. Namely, we assume that a type $i$ 
customer service in pool $j$ is allowed only if $(ij) \in \mathcal{E}$. (Equivalently, we can 
a priori assume that $\mathcal{E}$ is the set of all possible activities, i.e. $\mu_{ij} = 0$ when 
$(ij) \notin \mathcal{E}$, and $\mathcal{E}$ is a tree. In this case CRP requires that all feasible activities 
are basic.) For a customer type $i$, let $S(i) = \{j : (ij) \in \mathcal{E}\}$; for a server type 
$j$, let $C(j) = \{i : (ij) \in \mathcal{E}\}$. 

2.2. Leaf activity priority (LAP) policy. We analyze the performance of 
the following policy, which we call leaf activity priority (LAP). The first 
step in its definition is the assignment of priorities to customer classes and 
activities. 

Consider the basic activity tree, and assign priorities to the edges as fol- 
lows. First, we assign priorities to customer classes by iterating the following 
procedure: 

1. Pick a leaf of the tree; 

2. If it is a customer class (rather than a server class), assign to it the 
highest priority that hasn't yet been assigned; 

3. Remove the leaf from the tree. 

Without loss of generality, we assume the customer classes are numbered 
in order of priority (with 1 being highest). We now assign priorities to the 
edges of the basic activity tree by iterating the following procedure: 

1. Pick the highest-priority customer class; 

2. If this customer class is a leaf, pick the edge going out of it, assign this 
edge the highest priority that hasn't yet been assigned, and remove 
the edge together with the customer class; 

3. If this customer class is not a leaf, then pick any edge from it to a 
server class leaf (such an edge necessarily exists), assign to this edge the highest 
priority that hasn't yet been assigned, and remove the edge. 

It is not hard to verify that this algorithm will successfully assign priorities 
to all edges; it suffices to check that at any time the highest remaining 
priority customer class will have at most one outgoing edge to a non-leaf 
server class. 
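The two procedures above can be sketched in Python as follows. This is a minimal illustration under the tree assumption, not the authors' implementation; the function name `lap_priorities`, the vertex tagging, and the deterministic tie-breaking via `min` are choices made for this example (any valid choice of leaf or edge is allowed by the procedure).

```python
from collections import defaultdict


def lap_priorities(edges):
    """Assign LAP priorities on a basic activity tree.

    edges: iterable of (customer, server) pairs forming a tree.
    Returns (customer_order, edge_order), highest priority first.
    """
    adj = defaultdict(set)
    for i, j in edges:
        adj[("c", i)].add(("s", j))
        adj[("s", j)].add(("c", i))

    # Procedure 1: eliminate leaves; customer leaves receive priorities.
    work = {v: set(n) for v, n in adj.items()}
    customer_order = []
    while work:
        # A vertex with at most one remaining neighbour is a leaf.
        leaf = min(v for v, n in work.items() if len(n) <= 1)
        if leaf[0] == "c":
            customer_order.append(leaf[1])
        for n in work.pop(leaf):
            work[n].discard(leaf)

    # Procedure 2: walk customers by priority and remove their edges.
    work = {v: set(n) for v, n in adj.items()}
    edge_order = []
    for i in customer_order:
        c = ("c", i)
        while work[c]:
            if len(work[c]) == 1:
                # The customer class is a leaf: take its only edge.
                s = next(iter(work[c]))
            else:
                # Otherwise take an edge to a server leaf (exists for a tree).
                s = min(n for n in work[c] if len(work[n]) == 1)
            edge_order.append((i, s[1]))
            work[c].discard(s)
            work[s].discard(c)
    return customer_order, edge_order


# Hypothetical tree: customer 1 - pool 1 - customer 2 - pool 2.
cust, acts = lap_priorities([(1, 1), (2, 1), (2, 2)])
```

For this small tree the sketch ranks customer class 1 above class 2 and ends with the activity (2, 2), which plays the role of the lowest-priority activity.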

Remark 5. This algorithm does not produce a unique assignment of 
priorities, neither for the customer classes nor for the activities, because 
there may be multiple options for picking a next leaf or edge to remove, 
in the corresponding procedures. This is not a problem, because our results 
hold for any such assignment. Different priority assignments may correspond 
to different equilibrium points (defined below in Section 2.3); once we have 
picked a particular priority assignment, there is a (unique) corresponding 
equilibrium point, and we will be showing steady-state tightness around 
that point. Furthermore, the flexibility in assigning priorities may be a useful 
feature in practice. For example, it is easy to specialize the above priority 
assignment procedure so that the lowest priority is given to any a priori 
picked activity. 

We will write $(ij) < (i'j')$ to mean that activity $(ij)$ has higher priority 
than activity $(i'j')$. It follows from the priority assignment algorithm that 
$i < i'$ (customer class $i$ has higher priority than $i'$) implies $(ij) < (i'j')$. In 
particular, if $j = j'$, we have $(ij) < (i'j)$ if and only if $i < i'$. Without loss 
of generality, we shall assume that the server classes are numbered so that 
the lowest-priority activity is $(IJ)$. 

Now we define the LAP policy itself. It consists of two parts: routing and 
scheduling. "Routing" determines where an arriving customer goes if it sees 
available servers of several different types. "Scheduling" determines which 
waiting customer a server picks if it sees customers of several different types 
waiting in queue. 

Routing: An arriving customer of type $i$ picks an unoccupied server in 
the pool $j \in S(i)$ such that $(ij) \le (ij')$ for all $j' \in S(i)$ with idle servers. If 
no server pools in $S(i)$ have idle servers, the customer queues. 

Scheduling: A server of type $j$ upon completing a service picks the 
customer from the queue of type $i \in C(j)$ such that $i \le i'$ for all $i' \in C(j)$ with 
$Q_{i'} > 0$. If no customer types in $C(j)$ have queues, the server remains idle. 
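The routing and scheduling rules can be sketched as two small Python functions. The names `route` and `schedule`, and the dictionaries `S`, `C`, `rank`, `idle` and `queue`, are assumptions for this illustration; `rank[(i, j)]` is taken to be smaller for higher-priority activities, and customer classes are assumed to be numbered in order of priority, as in the text.

```python
def route(i, S, rank, idle):
    """Pool chosen by an arriving type-i customer, or None if it must queue."""
    candidates = [j for j in S[i] if idle[j] > 0]
    if not candidates:
        return None  # no idle server in S(i): the customer joins queue i
    # Among pools with idle servers, use the highest-priority activity.
    return min(candidates, key=lambda j: rank[(i, j)])


def schedule(j, C, queue):
    """Customer type picked by a freed server of pool j, or None to stay idle."""
    candidates = [i for i in C[j] if queue[i] > 0]
    if not candidates:
        return None  # no waiting customers that pool j can serve
    # Classes are numbered by priority, so the smallest index wins.
    return min(candidates)


# Hypothetical system with activities (1,1) < (2,1) < (2,2).
S = {1: [1], 2: [1, 2]}
C = {1: [1, 2], 2: [2]}
rank = {(1, 1): 0, (2, 1): 1, (2, 2): 2}
```

Both functions are stateless: they only read the current idle-server counts and queue lengths, which matches the "blind" character of the policy (no arrival or service rates are used).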

We introduce the following notation (for the system with scaling parameter 
$r$): $\Psi^r_{ij}(t)$, the number of servers of type $j$ serving customers of type $i$ at 
time $t$; $Q^r_i(t)$, the number of customers of type $i$ waiting for service at time 
$t$. 

Given the system operates under the LAP policy, the process 
$((\Psi^r_{ij}(t), (ij) \in \mathcal{E}), (Q^r_i(t), i \in \mathcal{I})), \ t \ge 0$, 
is a Markov process with countable state space. 

There are some obvious relations between system variables, which hold 
for each process realization: for example, for any $j \in S(i)$ and any time $t$, 
either $Q^r_i(t) = 0$ or $\sum_{i'} \Psi^r_{i'j}(t) = \beta_j r$; and so on. One additional notation: 
$Z^r_j(t) = \sum_i \Psi^r_{ij}(t) - r \sum_i \psi^*_{ij}$ is the "idleness" of pool $j$. If $j < J$, then 
$\sum_i \psi^*_{ij} = \beta_j$ and $Z^r_j(t) \le 0$. In what follows we will be mostly interested in 
the values of $(Z^r_j(t), j < J)$, namely the idlenesses in all server pools except 
pool $J$, for which $\sum_i \psi^*_{iJ} < \beta_J$. 
2.3. LAP equilibrium point. Informally speaking, the equilibrium point 
$((\psi^*_{ij}, (ij) \in \mathcal{E}), (q^*_i, i \in \mathcal{I}))$ is the desired operating point for the (fluid 
scaled) vector $((\Psi^r_{ij}/r, (ij) \in \mathcal{E}), (Q^r_i/r, i \in \mathcal{I}))$ of occupancies and queue 
lengths under the LAP policy. The formal definition is given below. 

Let us recursively define the quantities $\lambda_{ij} \ge 0$, which have the meaning 
of routing rates, scaled down by factor $1/r$. (These $\lambda_{ij}$ are not the same as 
those given by the optimal solution to the SPP (1).) For the activity $(1j)$ 
with the highest priority, define either $\lambda_{1j} = \lambda_1$ and $\psi^*_{1j} = \lambda_1/\mu_{1j}$, or $\psi^*_{1j} = \beta_j$ 
and $\lambda_{1j} = \beta_j \mu_{1j}$, according to whichever value of $\lambda_{1j}$ is smaller. Replace $\lambda_1$ by $\lambda_1 - \lambda_{1j}$ 
and $\beta_j$ by $\beta_j - \psi^*_{1j}$, and remove the edge $(1j)$ from the tree. We now proceed 
similarly with the remaining activities. 

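Formally, set 

$\lambda_{ij} = \min\Big\{ \lambda_i - \sum_{j': (ij') < (ij)} \lambda_{ij'}, \ \ \mu_{ij}\Big(\beta_j - \sum_{i': (i'j) < (ij)} \lambda_{i'j}/\mu_{i'j}\Big) \Big\}$. 

Since the definition is in terms of higher-priority activities, this defines the 
$(\lambda_{ij}, (ij) \in \mathcal{E})$ uniquely. The LAP equilibrium point is defined to be the 
vector 

$((\psi^*_{ij}, (ij) \in \mathcal{E}), (q^*_i, i \in \mathcal{I}))$ 

given by 

(2) $\psi^*_{ij} = \lambda_{ij}/\mu_{ij}, \quad q^*_i = 0 \quad$ for all $(ij) \in \mathcal{E}$, $i \in \mathcal{I}$. 

Clearly, by the above construction, we have 

$\lambda_i = \sum_j \lambda_{ij} = \sum_j \mu_{ij} \psi^*_{ij}, \ i \in \mathcal{I}; \qquad \sum_i \psi^*_{ij} \le \beta_j, \ j \in \mathcal{J}$. 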
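The recursive construction of the equilibrium point can be sketched in Python as follows. This is an illustrative transcription of the greedy recursion, assuming the activities are supplied in priority order (highest first); the function name `lap_equilibrium` and the example parameters are hypothetical.

```python
def lap_equilibrium(edge_order, lam, beta, mu):
    """Greedy computation of (lambda_ij) and (psi*_ij) in priority order.

    edge_order: activities (i, j), highest priority first.
    lam[i], beta[j], mu[(i, j)]: model parameters (illustrative names).
    """
    lam_left = dict(lam)    # arrival rate of each class not yet routed
    beta_left = dict(beta)  # capacity of each pool not yet occupied
    rate, psi = {}, {}
    for (i, j) in edge_order:
        # Route as much as the remaining flow or remaining capacity allows.
        x = min(lam_left[i], mu[(i, j)] * beta_left[j])
        rate[(i, j)] = x
        psi[(i, j)] = x / mu[(i, j)]
        lam_left[i] -= x
        beta_left[j] -= psi[(i, j)]
    return rate, psi


# Hypothetical parameters for the tree (1,1) < (2,1) < (2,2).
order = [(1, 1), (2, 1), (2, 2)]
lam = {1: 1.0, 2: 2.5}
beta = {1: 2.0, 2: 2.0}
mu = {(1, 1): 1.0, (2, 1): 1.0, (2, 2): 1.0}
rate, psi = lap_equilibrium(order, lam, beta, mu)
```

In this example pool 1 ends up fully occupied and only pool 2, which serves the lowest-priority activity, retains spare capacity, in line with the remark following Assumption 6.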

To avoid trivial complications, throughout the paper we make the follow- 
ing assumption: 

Assumption 6. If $(\psi_{ij}, (ij) \in \mathcal{E})$ are such that $\psi_{ij} \ge 0$, $\lambda_i = \sum_j \mu_{ij} \psi_{ij}$ for all $i$, 
and $\sum_i \psi_{ij} \le \beta_j$ for all $j$, then $\psi_{ij} > 0$ for all $(ij) \in \mathcal{E}$. 

This assumption implies, in particular, that for the equilibrium point we 
must have $\psi^*_{ij} > 0$ for all $(ij) \in \mathcal{E}$ and, moreover, $\sum_i \psi^*_{ij} = \beta_j$ for all $j < J$ 
and $\sum_i \psi^*_{iJ} < \beta_J$. 

Assumption 6 means that the system needs to employ (on average) 
all activities in $\mathcal{E}$ in order to be able to handle the load. It holds, for example, 
whenever $\rho$ is sufficiently close to 1. 
Remark 7. Assumption 6 is technical: our main result, Theorem 8, 
can be proved without it, by following the approach presented in the paper. 
However, it simplifies the statements and proofs of many auxiliary results, and 
thus substantially improves the exposition. 

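2.4. Basic notation. Vector $(\xi_i, i \in \mathcal{I})$, where $\xi$ can be any symbol, is 
often written as $(\xi_i)$; similarly, $(\xi_j, j \in \mathcal{J}) = (\xi_j)$ and $(\xi_{ij}, (ij) \in \mathcal{E}) = (\xi_{ij})$. 
Furthermore, we often use notation $(\eta_{ij}, \xi_i)$ to mean $((\eta_{ij}, (ij) \in \mathcal{E}), (\xi_i, i \in \mathcal{I}))$, 
and similar notations as well. Unless specified otherwise, 
$\sum_i \xi_{ij} = \sum_{i \in C(j)} \xi_{ij}$ and $\sum_j \xi_{ij} = \sum_{j \in S(i)} \xi_{ij}$. For functions (or random processes) 
$(\xi(t), t \ge 0)$ we often write $\xi(\cdot)$. (And similarly for functions with domain 
different from $[0, \infty)$.) So, for example, $(\xi_i(\cdot))$ signifies $((\xi_i(t), i \in \mathcal{I}), t \ge 0)$. 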

In the Euclidean space $\mathbb{R}^d$ (with appropriate dimension $d$): $|x|$ denotes 
standard Euclidean norm of vector $x$; symbol $\to$ denotes ordinary 
convergence; we write simply $0$ for a zero vector. Abbreviation u.o.c. means uniform 
on compact sets convergence of functions, with the domain defined explicitly 
or by the context. We always consider the Borel $\sigma$-algebra on $\mathbb{R}^d$ when it is 
viewed as a measurable space. The symbol $\stackrel{w}{\to}$ denotes weak convergence of 
probability distributions. W.p.1 means with probability 1. We will consider 
a sequence of systems indexed by scaling parameter $r$ increasing to infinity, 
and will use abbreviation w.p.1-l.r as a short for w.p.1 for all sufficiently 
large $r$. 

We denote by $Dist[\xi]$ the distribution of a random element $\xi$, and by 
$Inv[\xi(\cdot)]$ the stationary distribution of a Markov process $\xi(\cdot)$ (it will be 
unique in all cases that we consider). 

3. Main result. It was shown in [17, Theorem 10] that, if the system 
under LAP policy is strictly subcritically loaded, i.e. $\rho < 1$, then for all large 
$r$ the Markov process $(\Psi^r_{ij}(\cdot), Q^r_i(\cdot))$ is positive recurrent, has unique 
stationary distribution $Inv[(\Psi^r_{ij}(\cdot), Q^r_i(\cdot))]$ and, moreover, the sequence of 
stationary distributions is tight on the scale $r^{1/2+\epsilon}$ with any $\epsilon > 0$. In this paper 
we strengthen this result by showing that the invariant distributions are in 
fact tight on the diffusion, i.e. $r^{1/2}$, scale. This is, of course, the strongest 
possible tightness result for the system and the asymptotic regime in this 
paper. As a consequence, we obtain a limit interchange result: the limit of 
diffusion-scaled invariant distributions is equal to the invariant distribution 
of the limiting diffusion process. 

Theorem 8. Consider the sequence of systems under LAP policy, in the 
scaling regime and under the assumptions specified in Section 2, with $\rho < 1$. 
Then, the sequence of diffusion-scaled stationary distributions, 
$Inv[r^{-1/2}(\Psi^r_{ij}(\cdot) - \psi^*_{ij} r, Q^r_i(\cdot))]$, is tight. Moreover, 

(3) $Inv[r^{-1/2}(\Psi^r_{ij}(\cdot) - \psi^*_{ij} r)] \stackrel{w}{\to} Inv[(\hat\Psi_{ij}(\cdot))]$, 

where $(\hat\Psi_{ij}(\cdot))$ is the diffusion process, defined by the stochastic integral 
equation (26), and for any $\nu > 0$ 

(4) $Inv[r^{-\nu}((Q^r_i(\cdot)), (Z^r_j(\cdot), j < J))] \stackrel{w}{\to} Dist[0]$, 

where $Dist[0]$ is the Dirac measure concentrated at the zero vector. 

4. Proof of Theorem 8. In the rest of the paper, we will use the 
following additional notation for the system variables. For a system with 
parameter $r$, we denote: 

$X^r_i(t) = \sum_j \Psi^r_{ij}(t) + Q^r_i(t)$ is the total number of customers of type $i$ in the 
system at time $t$; 

$A^r_i(t)$ is the total number of exogenous arrivals of type $i$ customers into the 
system in interval $[0, t]$; 

$D^r_{ij}(t)$ is the total number of customers of type $i$ that completed the service 
in pool $j$ (and departed the system) in interval $[0, t]$; 

finally, we will use short notation $F^r(t) = (\Psi^r_{ij}(t) - \psi^*_{ij} r, Q^r_i(t))$. 

We can and do assume that a random realization of the system with 
parameter $r$ is determined by its initial state and realizations of "driving" 
unit-rate, mutually independent, Poisson processes $\Pi^{(a)}_i(\cdot), i \in \mathcal{I}$, and 
$\Pi^{(s)}_{ij}(\cdot), (ij) \in \mathcal{E}$, as follows: 

$A^r_i(t) = \Pi^{(a)}_i(\lambda_i r t), \qquad D^r_{ij}(t) = \Pi^{(s)}_{ij}\Big(\mu_{ij} \int_0^t \Psi^r_{ij}(s)\, ds\Big)$; 

the driving Poisson processes are common for all $r$. It is easy to see that, 
given the LAP policy, with probability 1 the realizations of these driving 
processes (along with initial state) indeed uniquely define the system process 
realization. 

Finally, the diffusion scaled variables are defined as follows: 

$(\hat\Psi^r_{ij}(t), \hat Q^r_i(t)) = r^{-1/2}(\Psi^r_{ij}(t) - \psi^*_{ij} r, Q^r_i(t))$, 

$\hat X^r_i(t) = r^{-1/2}[X^r_i(t) - \sum_j \psi^*_{ij} r], \qquad \hat Z^r_j(t) = r^{-1/2} Z^r_j(t)$. 

Throughout this section, we will use the following strong approximation 
of Poisson processes, available e.g. in [4, Chapters 1 and 2]: 
Proposition 9. A unit rate Poisson process $\Pi(\cdot)$ and a standard 
Brownian motion $W(\cdot)$ can be constructed on a common probability space in such 
a way that the following holds. For some fixed positive constants $C_1, C_2, C_3$, 
for all $T > 1$ and all $u \ge 0$, 

$P\Big( \sup_{0 \le t \le T} |\Pi(t) - t - W(t)| \ge C_1 \log T + u \Big) \le C_2 e^{-C_3 u}$. 

We will also need the following form of a functional strong law of large 
numbers for a Poisson process. It is obtained using standard large deviations 
estimates, e.g. analogously to the way it is done in the proof of [14, Lemma 
4.3]. 

Proposition 10. For a unit rate Poisson process $\Pi(\cdot)$, the following 
holds with probability 1. For any $\nu \in (0, 1)$ and any $c > 1$, as $r \to \infty$, uniformly in 
$t_1, t_2 \in [0, r^c]$ such that $t_2 - t_1 \ge r^\nu$, 

$[\Pi(t_2) - \Pi(t_1)]/[t_2 - t_1] \to 1$. 

Throughout this paper, we will use Proposition 10 with arbitrary fixed $c > 1$: 
this ensures that for any fixed $T > 0$, the interval $[0, T r \log r]$ is contained 
within $[0, r^c]$ for all large $r$. Proposition 10, in particular, immediately implies 
the following upper bound on the rate at which system variables can change. 
There exists $C > 0$, such that for any $\nu \in (0, 1)$ and any $a > 0$, w.p.1-l.r, 
uniformly in $t_1, t_2 \in [0, r^{c-1}]$ such that $t_2 - t_1 \ge a r^\nu / r$, 

(5) $\max_{t \in [t_1, t_2]} |Q^r_i(t) - Q^r_i(t_1)| \le C (t_2 - t_1) r, \quad \forall i$, 

and similarly for $\Psi^r_{ij}(\cdot)$, $\forall (ij)$, $Z^r_j(\cdot)$, $\forall j$, and $F^r(\cdot)$. Indeed, in a system with 
parameter $r$, the customer arrival and departure events occur, "at most", as 
$\Pi([\sum_i \lambda_i + (\sum_j \beta_j) \max_{(ij)} \mu_{ij}]\, r t)$, where $\Pi(\cdot)$ is a unit rate Poisson process; 
therefore, the condition $t_2 - t_1 \ge a r^\nu / r$ in the $r$-th system guarantees that 
the interval $[t_1, t_2]$ corresponds to at least an $O([t_2 - t_1] r) = O(r^\nu)$-long time 
interval for $\Pi(\cdot)$, and then Proposition 10 applies. 
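The law-of-large-numbers behaviour used here can be checked numerically with a short Python sketch. It simulates one path of a unit rate Poisson process from exponential interarrival times and computes the increment ratio over a long interval; the function names, the fixed seed and the horizon are assumptions for this illustration, and a single simulated path is of course not a proof of Proposition 10.

```python
import random


def poisson_jump_times(horizon, seed=0):
    """Jump times of a unit rate Poisson process on [0, horizon]."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(1.0)  # i.i.d. exponential(1) interarrival times
        if t > horizon:
            return times
        times.append(t)


def increment_ratio(times, t1, t2):
    """[Pi(t2) - Pi(t1)] / (t2 - t1) for the path given by `times`."""
    count = sum(1 for s in times if t1 < s <= t2)
    return count / (t2 - t1)


times = poisson_jump_times(20000.0)
ratio = increment_ratio(times, 5000.0, 15000.0)
```

Over an interval of length 10000 the ratio should be close to 1, with fluctuations of order 0.01; shorter intervals give visibly larger deviations, which is why the proposition requires $t_2 - t_1 \ge r^\nu$.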

Lemma 11. There exists $T > 0$ such that for any $\epsilon \in (0, 1/2)$ the 
following holds. For any $\delta > 0$, there exists a sufficiently large $C_7 > 0$ such 
that, uniformly on all sufficiently large $r$ and all $|F^r(0)| \le g(r) = r^{1/2+\epsilon}$, 
the probability of $|F^r(t)| \le C_7 r^{1/2}$ occurring within $[0, \epsilon T \log r]$ is at least 
$1 - \delta$. 
Proof. The proof is by contradiction. If the lemma does not hold, then there 
exists a function $g_*(r)$ such that $g_*(r)/r^{1/2} \uparrow \infty$ and the probability of 
starting from $|F^r(0)| \le g(r)$ and not hitting $|F^r(t)| \le g_*(r)$ within time $\epsilon T \log r$ 
does not vanish. We will prove that it has to vanish, thus establishing a 
contradiction. 

Denote $|F^r(0)| = h(r)$. We now specify the choice of $T$. By Corollary 25 in 
[17] we can and do choose a sufficiently large $T > 0$ such that the conditions 

(6) $\max_{t \in [0, T]} |\Pi^{(a)}_i(\lambda_i r t) - \lambda_i r t| \le \delta_2 h(r), \quad \forall i$, 

and similar for $\Pi^{(s)}_{ij}$, with sufficiently small fixed $\delta_2 > 0$, guarantee 
that condition $g(r) \ge h(r) = |F^r(0)| \ge g_*(r)$ implies that $|F^r|$ decreases 
at least by a factor $K > 1$ in $[0, T]$. Let us see how the probability of (6) 
depends on $h(r)$, or more conveniently on $h_1(r) = h(r)/r^{1/2}$. (Note that 
$h_1(r) \uparrow \infty$ when $h(r) \ge g_*(r)$.) 

Now we will use Proposition 9. In its statement let us replace $\Pi$ with $\Pi^{(a)}_i$, 
$t$ with $\lambda_i r t$, $T$ with $\lambda_i r T$, make $u$ a function of $r$, say $u = r^{1/4}$, and recall 
that $(W(\lambda_i r t)/h(r), t \ge 0)$, where $W(\cdot)$ is a standard Brownian motion, 
is equal in distribution to $(\sqrt{\lambda_i}\, W(t)/h_1(r), t \ge 0)$. We conclude that the 
probability of (6) is lower bounded by 

$1 - C_2 e^{-C_3 r^{1/4}} - C_4 \exp\{-C_5 h_1(r)^2\}$, 

where $C_2, C_3 > 0$ are universal (from the statement of Proposition 9), while 
$C_4, C_5 > 0$ depend on $\delta_2$ and $T$ (and system parameters). Denote 

$p_\ell = P\{|F^r(t)| \le g_*(r)$ for some $t \in [0, \ell T] \ | \ |F^r(0)| \le K^\ell g_*(r)\}, \quad \ell = 0, 1, 2, \ldots$ 

We can write, for any $\ell \ge 1$, 

$p_\ell \ge [1 - C_2 e^{-C_3 r^{1/4}} - C_4 \exp\{-C_5 K^{2\ell} (g_*(r)/r^{1/2})^2\}]\, p_{\ell-1}$. 

We are interested in $p_k$ with $k = \epsilon \log r$, which is lower bounded as 

$p_k \ge \prod_{i=1}^{k} [1 - C_2 e^{-C_3 r^{1/4}} - C_4 \exp\{-C_5 K^{2i} g_*(r)^2 / r\}] \ge 1 - \sum_{i=1}^{k} [C_2 e^{-C_3 r^{1/4}} + C_4 \exp\{-C_5 K^{2i} g_*(r)^2 / r\}]$. 

The sum vanishes as $r \to \infty$, and so does $1 - p_k$. □ 
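The last two steps of the proof can be spelled out as follows. This is a sketch of one way to fill in the details, not a quotation of the original argument. It uses the elementary inequality for products of numbers in $[0, 1]$ and Bernoulli's inequality $K^{2i} \ge 1 + i(K^2 - 1)$; the shorthands $x_r = g_*(r)^2/r$ and $q_r$ are introduced only for this sketch.

```latex
% Product bound (valid once r is large enough that every a_i lies in [0,1]):
\prod_{i=1}^{k} (1 - a_i) \;\ge\; 1 - \sum_{i=1}^{k} a_i ,
\qquad a_i \in [0,1].

% First part of the sum, with k = \epsilon \log r terms:
\sum_{i=1}^{k} C_2 e^{-C_3 r^{1/4}}
  \;=\; \epsilon \log r \cdot C_2 e^{-C_3 r^{1/4}} \;\to\; 0 .

% Second part, with x_r = g_*(r)^2 / r \to \infty and
% q_r = e^{-C_5 (K^2 - 1) x_r} \to 0:
\sum_{i=1}^{k} C_4 e^{-C_5 K^{2i} x_r}
  \;\le\; C_4 e^{-C_5 x_r} \sum_{i \ge 1} q_r^{\,i}
  \;=\; C_4 e^{-C_5 x_r} \, \frac{q_r}{1 - q_r} \;\to\; 0 .
```

The geometric bound matters here: the number of terms $k = \epsilon \log r$ grows with $r$, so bounding each term by the largest one would not by itself show that the sum vanishes.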
The key part of the rest of the proof of Theorem 8 is to show that, 
informally speaking, if the process "hits" the set $\{|F^r| \le C_7 r^{1/2}\}$ anywhere 
within $[0, \epsilon T \log r]$, then it stays "on $r^{1/2}$-scale" at time $\epsilon T \log r$ as well. To 
do this we will exploit the closeness of the diffusion scaled process to the 
diffusion limit, on an $\epsilon T \log r$-long interval (i.e., with length increasing with 
$r$), when $\epsilon$ is small enough. This will be formalized in Lemma 13, but to 
apply it we need an additional step, given by the following 

Lemma 12. There exist $T_8 > 0$ and $C_8 > 0$ such that the following holds. 
For any fixed $C_9 > 0$, $\delta_9 > 0$ and $\nu_9 \in (0, 1/2)$, uniformly on initial states 
$|F^r(0)| \le C_9 r^{1/2}$, as $r \to \infty$, 

(7) $P\{\max_{t \in [0, T_8 C_9 r^{-1/2}]} |F^r(t)| \le C_8 C_9 r^{1/2}\} \to 1$, 

(8) $P\{\exists t \in [0, T_8 C_9 r^{-1/2}] : \ |(Q^r_i(t))| + |(Z^r_j(t), j < J)| \le \delta_9 r^{\nu_9}\} \to 1$. 

We will use this lemma (and Lemma 14 below) with $0 < \nu_9 < 1/4$. 

Proof. Let us first discuss the basic intuition behind the result, which 
is extremely simple, and will be useful not only for this proof, but for 
some other proofs in the paper as well. Within a fixed $O(r^{-1/2})$ time, 
$F^r(t)$ can change at most by $O(r^{1/2})$ - see (5) - and therefore, for all 
$(ij)$, $\Psi^r_{ij}(t)/[\psi^*_{ij} r] \approx 1$ holds. Now, consider the highest-priority activity 
$(1j)$. Suppose customer class 1 is a leaf. Then, there must exist at least 
one other activity $(ij)$, associated with the same pool $j$. The arrival rate of 
type 1 is $\lambda_1 r = \mu_{1j} \psi^*_{1j} r$, while the total service completion rate at pool $j$ 
is at least $\mu_{1j} \Psi^r_{1j}(t) + \mu_{ij} \Psi^r_{ij}(t) \approx \mu_{1j} \psi^*_{1j} r + \mu_{ij} \psi^*_{ij} r = \lambda_1 r + \mu_{ij} \psi^*_{ij} r$. This 
means that, since type 1 has the highest priority at pool $j$, the queue $Q^r_1(t)$, 
when non-zero, "drains" at the rate at least $O(r)$, "hits" $r^\nu$ scale within 
$O(r^{-1/2})$ time and "stays there." Suppose now that class 1 is not a leaf. 
Then pool $j$ must be a leaf, i.e. it serves type 1 exclusively, $\psi^*_{1j} = \beta_j$, and 
there must be at least one other activity $(1m)$, associated with type 1, 
implying $\lambda_1 \ge \mu_{1j} \psi^*_{1j} + \mu_{1m} \psi^*_{1m} > \mu_{1j} \beta_j$. The difference between type 1 arrival 
rate and the rate they are served by pool $j$ is at least $[\lambda_1 - \mu_{1j} \beta_j] r = O(r)$. 
This means that the idleness $|Z^r_j(t)|$, when non-zero, decreases at the rate 
at least $O(r)$, "hits" $r^\nu$ scale within $O(r^{-1/2})$ time and "stays there." We 
"remove" activity $(1j)$ from the activity tree. The argument proceeds by 
considering all activities $(ij)$ in sequence, from the highest to lowest priority; 
at each step either $Q^r_i(t)$ or $Z^r_j(t)$ is "eliminated", depending on $i$ or $j$, 
respectively, being a leaf of the current activity tree. The exception is when 
$j = J$ is the pool serving the lowest priority activity $(IJ)$: in this case $Z^r_J(t)$ 
is not eliminated. We now proceed with a sketch of a formal argument - 
details can be easily "recovered" by the reader. 

The proof of (7) is an immediate consequence of (5). Indeed, for any 
$T_8 > 0$, w.p.1-l.r, the value of $|F^r(t) - F^r(0)|$ with $t \in [0, T_8 C_9 r^{-1/2}]$ is upper 
bounded by $C T_8 C_9 r^{1/2}$. So, for any chosen $T_8$ we can choose $C_8 > 1 + C T_8$. 

Property (7), in particular, means that for any fixed $T_8 > 0$, w.p.1, for 
any $(ij) \in \mathcal{E}$, uniformly in $t \in [0, T_8 C_9 r^{-1/2}]$ we have 

(9) $\Psi^r_{ij}(t)/[\psi^*_{ij} r] \to 1$. 

To prove (8), we consider and "eliminate" activities one by one, in the 
order of their priority. The choice of $T_8$ will be made later - for now it is a 
fixed constant, and we consider the process in the interval $[0, T_8 C_9 r^{-1/2}]$. We 
start with the highest priority activity $(1j)$. Suppose first that customer class 
1 is a leaf of the activity tree. (In this case, $C(j)$ necessarily contains at least 
one customer class in addition to 1.) Consider any $0 < C_1 < \sum_{i \ne 1} \mu_{ij} \psi^*_{ij}$. 
Then, for any $\delta > 0$, there exists a sufficiently small $\delta_1 > 0$, such that, 
w.p.1-l.r, uniformly in $t \in [0, T_8 C_9 r^{-1/2}]$, condition $Q^r_1(t) \ge \delta r^{\nu_9}$ implies 
$Q^r_1(t + \delta_1 r^{\nu_9}/r) - Q^r_1(t) \le -C_1 \delta_1 r^{\nu_9}$, and for any $Q^r_1(t)$ we have (by (5)) 
$\max_{\tau \in [0, \delta_1 r^{\nu_9}/r]} Q^r_1(t + \tau) \le Q^r_1(t) + C \delta_1 r^{\nu_9}$. This means that w.p.1, 

$\max_{t \in [T', T_8 C_9 r^{-1/2}]} Q^r_1(t) \le (\delta + C \delta_1) r^{\nu_9}$, 

where $T' = 2(1/C_1) C_9 r^{-1/2}$. Note that this holds for any $\delta$ and the 
corresponding $\delta_1$, both of which can be chosen arbitrarily small. We conclude 
that w.p.1, 

(10) $\max_{t \in [T', T_8 C_9 r^{-1/2}]} Q^r_1(t)/r^{\nu_9} \to 0$. 

This in turn implies the following. Denote by $H^r_{ij}(t_1, t_2)$ the number of type 
$i$ customers that enter service in pool $j$ in the time interval $(t_1, t_2]$. For any 
fixed $\delta_1 > 0$, w.p.1, uniformly in $t_1, t_2 \in [T', T_8 C_9 r^{-1/2}]$ such that $t_2 - t_1 \ge \delta_1 r^{\nu_9}/r$, 

(11) $H^r_{1j}(t_1, t_2)/[\lambda_1 r (t_2 - t_1)] \to 1$. 

Finally, note that, again by (5), w.p.1-l.r, at time $T'$, $|F^r|$ is at most by 
a constant factor (depending on $C_1$) larger than $C_9 r^{1/2}$. Our conclusions 
about the $(1j)$ activity can be informally summarized as follows: within a 
time $T' = 2(1/C_1) C_9 r^{-1/2}$, proportional to $C_9 r^{-1/2}$, the value of $Q^r_1(t)/r^{\nu_9}$ 
"drains to 0" and "stays there" (in the sense of (10)) until the end of interval 
$[0, T_8 C_9 r^{-1/2}]$; moreover, in the interval $[T', T_8 C_9 r^{-1/2}]$, the rate at which 
server pool $j$ "takes" type 1 customers is "equal" (in the sense of (11)) to 
their arrival rate $\lambda_1 r$. Therefore, starting time $T'$ we can "eliminate" and 
"ignore" activity $(1j)$ in the sense that we know that the rate at which 
pool $j$ can take for service customers of the types other than 1 is "at least" 
$[\sum_{i \ne 1} \mu_{ij} \psi^*_{ij}] r$. More precisely, if we denote by $\bar H^r_j(t_1, t_2)$ the number of 
times in the interval $(t_1, t_2]$ when a service completion by a server in pool $j$ 
was not followed (either immediately or after some idle period) by taking a 
type 1 customer for service, then the following holds: for any fixed $\delta_1 > 0$, 
w.p.1, uniformly in $t_1, t_2 \in [T', T_8 C_9 r^{-1/2}]$ such that $t_2 - t_1 \ge \delta_1 r^{\nu_9}/r$, 

(12) $\liminf_{r \to \infty} \bar H^r_j(t_1, t_2)/[r (t_2 - t_1)] \ge \sum_{i \ne 1} \mu_{ij} \psi^*_{ij}$. 

Moreover, $|F^r(T')|$ is at most by a constant factor larger than $C_9 r^{1/2}$, which is the 
upper bound on $|F^r(0)|$. 

Suppose now that class 1 is not a leaf. Then necessarily pool $j$ is a leaf and 
$j < J$. In this case, by looking at the evolution of idleness $Z^r_j(t)$, and using 
similar arguments, we can show that, again, within a time proportional to 
$C_9 r^{-1/2}$, let us call it $T''$, the value of $Z^r_j(t)/r^{\nu_9}$ "drains to 0" and "stays 
there" (in the sense analogous to (10)) until the end of interval $[0, T_8 C_9 r^{-1/2}]$; 
this in turn means that the rate at which type 1 customers will enter pool 
$j$ in the interval $[T'', T_8 C_9 r^{-1/2}]$ will be "equal" (in the sense analogous to 
(11)) to $\mu_{1j} \beta_j r$. And again, w.p.1-l.r, $|F^r(T'')|$ is at most by a constant factor 
larger than $C_9 r^{1/2}$. Therefore, starting time $T''$ we can "eliminate" activity 
$(1j)$ in the sense that we can "ignore" pool $j$ and "assume" that the arrival 
rate of type 1 customers in the rest of the system is "equal" to $\lambda_1 r - \mu_{1j} \beta_j r$. 
(The latter is in the sense analogous to (12), but where we count the type 1 
arrivals that were not taken for service in the corresponding interval $(t_1, t_2]$.) 

We can proceed to "eliminate" the second-highest priority activity, and so
on. The total time for all scaled queues $Q^r_i(t)/r^{\nu_9}$ and all
idlenesses $Z^r_j(t)/r^{\nu_9}$, $j < J$, to "drain to 0" will be
proportional to $C_9 r^{-1/2}$, say $T_9 C_9 r^{-1/2}$. We then choose
$T_8 > T_9$. We omit further details, except to emphasize again that
property (8) does not include the "idleness" $Z^r_J$ for the pool $J$
serving the lowest-priority activity $(IJ)$. □

Lemma 13. Let $T > 0$ be fixed. For a sufficiently small $\epsilon > 0$ the
following holds. For any fixed $C_{11} > 0$, $\delta_9 > 0$, and
$\nu_9 \in (0, 1/4)$, uniformly on initial states satisfying
$|F^r(0)| < C_{11} r^{1/2}$ and
$|(Q^r_i(0), \forall i)| + |(Z^r_j(0), j < J)| < \delta_9 r^{\nu_9}$,

(13) $\max_{t \in [0, \epsilon T \log r]} |(\hat\Psi^r_{ij}(t)) - (\breve\Psi^r_{ij}(t))| \stackrel{P}{\longrightarrow} 0$,

where $(\breve\Psi^r_{ij}(\cdot))$ is the (strongly) unique strong solution of
the stochastic integral equation (26) (constructed on a common probability
space with $(\hat\Psi^r_{ij}(\cdot))$), with the initial state
$(\breve\Psi^r_{ij}(0)) = (\hat\Psi^r_{ij}(0))$.

To prove this lemma we will need a series of auxiliary results. 

Lemma 14. There exists $C_{10} > 0$ such that the following holds for any
$\epsilon > 0$, $T > 0$, $C_{11} > 0$, $\delta_9 > 0$ and
$\nu_9 \in (0, 1/2)$. As $r \to \infty$, uniformly on all initial states such
that $|F^r(0)| < C_{11} r^{1/2}$ and
$|(Q^r_i(0))| + |(Z^r_j(0), j < J)| < \delta_9 r^{\nu_9}$, we have

(14) $P\{\max_{t \in [0, T \log r]} |F^r(t)| < r^{1/2+\epsilon}\} \to 1$,

(15) $P\{\max_{t \in [0, T \log r]} [|(Q^r_i(t))| + |(Z^r_j(t), j < J)|] < C_{10} \delta_9 r^{\nu_9}\} \to 1$.

Proof. The proof of property (14) is already contained in the proof of
[17, Theorem 10(ii)]. Indeed, that proof considers the process on the
interval $[0, T \log r]$ and shows that, starting with $|F^r(0)| = o(r)$,
w.p.1-l.r, $|F^r(t)|$ "hits" the $r^{1/2+\epsilon}$-scale somewhere within
$[0, T \log r]$, and then "stays" on this scale until the end of the
interval. In our case, $|F^r(0)|$ is already on the $r^{1/2+\epsilon}$-scale,
and so the process w.p.1-l.r stays on it in the entire interval
$[0, T \log r]$.

Given (14), to prove (15) we can "reuse" the proof of (8) of Lemma 12. In
that proof we showed that, starting with $|F^r(0)| = O(r^{1/2})$, w.p.1-l.r,
the quantity $|(Q^r_i(t), \forall i)| + |(Z^r_j(t), j < J)|$ "hits the
$r^{\nu_9}$-scale" within $O(r^{-1/2})$ time and "stays there". In our case,
the initial state is already such that
$|(Q^r_i(0), \forall i)| + |(Z^r_j(0), j < J)| = O(r^{\nu_9})$, and therefore
this quantity stays $O(r^{\nu_9})$ in the entire interval. The fact that here
we consider a much longer interval, namely $O(\log r)$ as opposed to
$O(r^{-1/2})$, is immaterial, because (14), and therefore (9), holds on the
entire interval and $r \log r = o(r^c)$ (so that we can use Proposition 10).
We omit further details. □

Proposition 15. There exists a set of independent standard Brownian
motions, $W^{(a)}_i(\cdot)$ and $W^{(s)}_{ij}(\cdot)$, constructed on the same
probability space as the set of Poisson processes $\Pi^{(a)}_i(\cdot)$ and
$\Pi^{(s)}_{ij}(\cdot)$, such that the following holds. For any fixed
$T > 0$, as $r \to \infty$: for each $i$

(16) $\sup_{0 \le t \le T \log r} r^{-1/4} |\Pi^{(a)}_i(rt) - rt - W^{(a)}_i(rt)| \to 0$, w.p.1,

and for each $(ij) \in \mathcal{E}$

(17) $\sup_{0 \le t \le T \log r} r^{-1/4} |\Pi^{(s)}_{ij}(rt) - rt - W^{(s)}_{ij}(rt)| \to 0$, w.p.1.



Proof. This follows from Proposition 9: in its statement we replace $t$
with $rt$, $T$ with $rT \log r$, and $u$ with $r^{1/8}$. □

Proposition 16. Consider any sequence of standard Brownian motions,
$B_1(\cdot), B_2(\cdot), \ldots$, defined on a common probability space.
(They may be dependent.) Let $T > 0$, $C_{12} > 0$ and
$\epsilon \in (0, 1/4)$ be fixed. Then, w.p.1-l.r, conditions
$t_1, t_2 \in [0, T \log r]$ and $|t_2 - t_1| < C_{12} r^{-1/2+\epsilon}$
imply that $|B_r(t_2) - B_r(t_1)| < r^{-1/8}$.

Proof. This follows from basic properties of Brownian motion. Fix
$\epsilon' \in (1/8, 1/4 - \epsilon/2)$. Then for some fixed $C_{13} > 0$,

(18) $P\{\max_{0 \le t \le C_{12} r^{-1/2+\epsilon}} |B_r(t) - B_r(0)| > r^{-\epsilon'}\} \le \exp\{-C_{13} [r^{-\epsilon'}/r^{-1/4+\epsilon/2}]^2\}$.

This probability decays very fast with $r$. We divide the interval
$[0, T \log r]$ into (a number, polynomial in $r$, of)
$C_{12} r^{-1/2+\epsilon}$-long subintervals, and use the above probability
estimate for each of them; by the Borel-Cantelli lemma, w.p.1-l.r, the event
(analogous to the event) in (18) will not hold for any of the subintervals.
The result follows. □
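
As a numerical illustration of the exponent bookkeeping in this proof (not part of the argument itself), the following Python sketch evaluates the product of the number of subintervals and the per-subinterval bound in (18). The function names and the sample values of $T$, $C_{12}$, $C_{13}$, $\epsilon$, $\epsilon'$ are assumptions made only for this illustration.

```python
import math


def tail_exponent(eps, eps_prime):
    """Exponent g such that the right-hand side of (18) reads exp(-C13 * r**g).

    Illustrative helper (not from the paper): it only checks the parameter
    ranges used in the proof and returns 2 * (1/4 - eps/2 - eps_prime).
    """
    if not 0.0 < eps < 0.25:
        raise ValueError("need eps in (0, 1/4)")
    if not 0.125 < eps_prime < 0.25 - eps / 2.0:
        raise ValueError("need eps_prime in (1/8, 1/4 - eps/2)")
    # [r^{-eps'} / r^{-1/4 + eps/2}]^2 = r^{2 (1/4 - eps/2 - eps')}
    return 2.0 * (0.25 - eps / 2.0 - eps_prime)


def union_bound(r, T, C12, C13, eps, eps_prime):
    """Number of subintervals of [0, T log r] times the bound in (18)."""
    g = tail_exponent(eps, eps_prime)
    # Subintervals have length C12 * r^{-1/2 + eps}; their number is
    # polynomial in r (up to the log r factor).
    n_sub = math.ceil(T * math.log(r) / (C12 * r ** (-0.5 + eps)))
    return n_sub * math.exp(-C13 * r ** g)


if __name__ == "__main__":
    # Hypothetical constants, chosen only to show the qualitative decay.
    for r in (1e10, 1e20, 1e30):
        print(r, union_bound(r, T=1.0, C12=1.0, C13=1.0, eps=0.1, eps_prime=0.15))
```

The interval $(1/8, 1/4 - \epsilon/2)$ is nonempty exactly when $\epsilon < 1/4$, which is why the sketch rejects other values of $\epsilon$; the stretched-exponential factor eventually dominates the polynomial number of subintervals.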



Proof of Lemma 13. Suppose that for each $r$ an initial state satisfying
the conditions of the lemma is fixed. Suppose the process, for any $r$, is
driven by a common set of Poisson processes, and associated Brownian motions
constructed on the same probability space, as specified in Proposition 15.
It will suffice to show that for any subsequence of $r$, there exists a
further subsequence, along which the lemma conclusion holds. So, let us fix
an arbitrary subsequence of $r$. We fix any $\nu_9 \in (0, 1/4)$ and choose a
further subsequence of $r$, with $r$ increasing sufficiently fast, so that
w.p.1-l.r the events in the displayed formulas in Lemma 14 hold.

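Denote:

$\hat A^r_i(t) = r^{-1/2} (\Pi^{(a)}_i(\lambda_i r t) - \lambda_i r t)$, $W^{(a),r}_i(t) = r^{-1/2} W^{(a)}_i(\lambda_i r t)$,

$\hat D^r_{ij}(t) = r^{-1/2} (\Pi^{(s)}_{ij}(\mu_{ij} \psi^*_{ij} r t) - \mu_{ij} \psi^*_{ij} r t)$, $W^{(s),r}_{ij}(t) = r^{-1/2} W^{(s)}_{ij}(\mu_{ij} \psi^*_{ij} r t)$.

Using standard sample path representation (see e.g. [13]), we can write,
for each $i$, and all $t \ge 0$:

(19) $X^r_i(t) = X^r_i(0) + A^r_i(t) - \sum_j D^r_{ij}(\mu_{ij} \int_0^t \Psi^r_{ij}(s) ds)$.

Switching, again in a standard way, to diffusion-scaled variables and to a
($I$-dimensional) vector form, we rewrite (19) as

(20) $(\hat X^r_i(t)) = (\hat X^r_i(0)) + (\hat A^r_i(t)) - (\sum_j \hat D^r_{ij}([(\psi^*_{ij} r t)^{-1} \int_0^t \Psi^r_{ij}(s) ds] t)) - (\sum_j \mu_{ij} \int_0^t \hat\Psi^r_{ij}(s) ds)$.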

Suppose $\epsilon \in (0, 1/4)$ (so that we can apply Proposition 16 later).
We will make the choice of $\epsilon$ more specific below.

W.p.1-l.r the following facts hold uniformly for $t \in [0, T \log r]$:

(21) $|\hat A^r_i(t) - W^{(a),r}_i(t)| < r^{-1/4}$, $\forall i$, $|\hat D^r_{ij}(t) - W^{(s),r}_{ij}(t)| < r^{-1/4}$, $\forall (ij)$,

(22) $|[(\psi^*_{ij} r t)^{-1} \int_0^t \Psi^r_{ij}(s) ds] t - t| < r^{-1/2+\epsilon} T \log r < r^{-1/2+\epsilon'}$, $\forall (ij)$,

(23) $|L'(\hat X^r_i(t)) - (\hat\Psi^r_{ij}(t))| < r^{-1/4}$,

where $\epsilon'$ is a fixed number within $(\epsilon, 1/4)$ and $L'$ is the
linear operator, defined in [17, Section 5.2], which maps a vector of
(centered) customer quantities into the vector of (centered) occupancies,
assuming all queues and idlenesses in pools $j < J$ are zero. Indeed:
properties (21) follow from Proposition 15; property (22) follows from (14)
in Lemma 14; property (23) follows from (15) in Lemma 14 and the definition
of operator $L'$.

Using properties (21)-(23), the sample path relation (20) implies the
following relation (written in vector form, with components indexed by
$(ij)$), which holds w.p.1-l.r uniformly for $t \in [0, T \log r]$:

(24) $(\hat\Psi^r_{ij}(t)) = (\hat\Psi^r_{ij}(0)) + L'(W^{(a),r}_i(t) - \sum_j W^{(s),r}_{ij}(t) - \sum_j \mu_{ij} \int_0^t \hat\Psi^r_{ij}(s) ds) + (\Delta^r(t))$,

where $|(\Delta^r(t))| < r^{-1/9}$. (Instead of 1/9 we could use any fixed
number in (0, 1/8).) Indeed, in (20) we can replace $\hat A^r_i$ and
$\hat D^r_{ij}$ with $W^{(a),r}_i$ and $W^{(s),r}_{ij}$, respectively, which
introduces an $O(r^{-1/4})$ error by (21); then, we apply operator $L'$ to
both sides and replace $L'(\hat X^r_i)$ with $(\hat\Psi^r_{ij})$, which
introduces an $O(r^{-1/4})$ error by (23); finally, we replace time
$[(\psi^*_{ij} r t)^{-1} \int_0^t \Psi^r_{ij}(s) ds] t$ with $t$ in
$W^{(s),r}_{ij}$, which introduces an $O(r^{-1/8})$ error by (22) and
Proposition 16.

In turn, (24) can be written as

(25) $(\hat\Psi^r_{ij}(t)) = (\hat\Psi^r_{ij}(0)) + L_1((W^{(a),r}_i(t)), (W^{(s),r}_{ij}(t))) + \int_0^t L_2 (\hat\Psi^r_{ij}(s)) ds + (\Delta^r(t))$,

where $L_1$ and $L_2$ are some fixed matrices. (Moreover, $L_2$ is exactly
the matrix in the ODE $(d/dt)(\psi_{ij}(t)) = L_2 (\psi_{ij}(t))$ for the
local fluid model in [17] (see (24) in [17]); from [17, Theorem 23], all
eigenvalues of $L_2$ have negative real parts. We will use this fact later,
in the conclusion of the proof of Theorem 8.) For each $r$ and each initial
condition $(\hat\Psi^r_{ij}(0))$, in addition to (25) consider the (strongly)
unique strong solution (by Theorems 5.2.9 and 5.2.5 of [11])
$(\breve\Psi^r_{ij}(\cdot))$ of the stochastic integral equation

(26) $(\breve\Psi^r_{ij}(t)) = (\breve\Psi^r_{ij}(0)) + L_1((W^{(a),r}_i(t)), (W^{(s),r}_{ij}(t))) + \int_0^t L_2 (\breve\Psi^r_{ij}(s)) ds$,

driven by the same set of Brownian motions
$((W^{(a),r}_i(\cdot)), (W^{(s),r}_{ij}(\cdot)))$ and with the same initial
condition $(\breve\Psi^r_{ij}(0)) = (\hat\Psi^r_{ij}(0))$. Thus, solutions to
both (25) and (26), for all $r$, are constructed on the same probability
space associated with the underlying set of independent Brownian motions
(and the corresponding Poisson processes coupled with them). W.p.1-l.r we
have for $t \in [0, T \log r]$:

$|(\hat\Psi^r_{ij}(t)) - (\breve\Psi^r_{ij}(t))| \le |(\Delta^r(t))| + \int_0^t L_3 |(\hat\Psi^r_{ij}(s)) - (\breve\Psi^r_{ij}(s))| ds$,

with some scalar constant $L_3 > 0$. By the Gronwall inequality (see e.g.
Theorem 5.1 in Appendix 5 of [5]), for $t \in [0, \epsilon T \log r]$:

(27) $|(\hat\Psi^r_{ij}(t)) - (\breve\Psi^r_{ij}(t))| \le r^{-1/9} e^{L_3 \epsilon T \log r} = r^{-1/9 + \epsilon L_3 T}$.

We now specify the choice of $\epsilon$: it is such that both
$-1/9 + \epsilon L_3 T < 0$ and (for the reasons explained earlier)
$\epsilon < 1/4$ hold. In other words,
$0 < \epsilon < \min\{1/4, 1/(9 L_3 T)\}$. □
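
The choice of $\epsilon$ at the end of this proof can be checked numerically. The Python sketch below is an illustration only: the function names and the sample values of $L_3$ and $T$ are assumptions, since the proof only asserts that some constant $L_3 > 0$ exists. It computes the admissible upper bound on $\epsilon$ and the resulting exponent of $r$ in (27).

```python
import math


def epsilon_upper_bound(L3, T):
    """Upper end of the admissible range 0 < eps < min{1/4, 1/(9 L3 T)}."""
    if L3 <= 0 or T <= 0:
        raise ValueError("L3 and T must be positive")
    return min(0.25, 1.0 / (9.0 * L3 * T))


def gronwall_exponent(eps, L3, T):
    """Exponent of r on the right-hand side of (27): -1/9 + eps * L3 * T."""
    return -1.0 / 9.0 + eps * L3 * T


def gronwall_bound(r, eps, L3, T):
    """Right-hand side of (27), written as a single power of r."""
    return r ** gronwall_exponent(eps, L3, T)


if __name__ == "__main__":
    L3, T = 2.0, 3.0                      # hypothetical constants
    eps = 0.5 * epsilon_upper_bound(L3, T)
    r = 1e9
    # The two forms of the bound in (27) agree up to rounding.
    lhs = r ** (-1.0 / 9.0) * math.exp(L3 * eps * T * math.log(r))
    print(eps, gronwall_exponent(eps, L3, T), lhs, gronwall_bound(r, eps, L3, T))
```

Any $\epsilon$ strictly below the returned bound makes the exponent negative, so the right-hand side of (27) vanishes as $r \to \infty$.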

Proposition 17. Uniformly on all fixed initial conditions from a bounded
set, the corresponding solutions to the stochastic integral equation (26)
have the following properties. Uniformly on all $t \ge 0$, the random vector
$(\breve\Psi^r_{ij}(t))$ is Gaussian, with bounded mean and covariance
matrix. Moreover, as $t \to \infty$, the mean vector and the covariance
matrix of $(\breve\Psi^r_{ij}(t))$ converge to those of the unique stationary
distribution, which is Gaussian with zero mean.

Proof. This follows from the fact that all eigenvalues of the drift matrix
$L_2$ have negative real parts: see (5.6.12), (5.6.13)', (5.6.14)',
Problem 5.6.6 and Theorem 5.6.7 in [11]. □
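
To illustrate the statement of Proposition 17 in the simplest possible setting, the following Python sketch treats the scalar analogue $dX = -aX\,dt + \sigma\,dW$ with $a > 0$ and a deterministic initial state. This is an assumption made for illustration: in the paper the drift is the matrix $L_2$, and the names `a`, `sigma`, `x0` below are hypothetical. For this scalar equation the mean and variance are available in closed form.

```python
import math


def ou_mean(t, x0, a):
    """Mean of X(t) for dX = -a X dt + sigma dW, X(0) = x0 (deterministic)."""
    return x0 * math.exp(-a * t)


def ou_variance(t, a, sigma):
    """Variance of X(t) for the same equation, started from a fixed point."""
    return sigma ** 2 / (2.0 * a) * (1.0 - math.exp(-2.0 * a * t))


def ou_stationary_variance(a, sigma):
    """Variance of the zero-mean Gaussian stationary distribution."""
    return sigma ** 2 / (2.0 * a)


if __name__ == "__main__":
    a, sigma, x0 = 2.0, 1.0, 3.0          # hypothetical parameters
    for t in (0.0, 0.5, 2.0, 10.0):
        print(t, ou_mean(t, x0, a), ou_variance(t, a, sigma))
    print("stationary variance:", ou_stationary_variance(a, sigma))
```

The mean decays to zero and the variance increases to $\sigma^2/(2a)$, uniformly over initial states from a bounded set; the negativity of the drift coefficient plays the role that the eigenvalue condition on $L_2$ plays in the matrix case.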

Conclusion of the proof of Theorem 8. Consider the Markov process
$F^r(\cdot)$ in the stationary regime. We choose $T$ as in Lemma 11, then
$\epsilon$ as in Lemma 13, and consider the process in the interval
$[0, \epsilon T \log r]$. Fix arbitrary $\nu_9 \in (0, 1/4)$. The
combination of [17, Theorem 10(ii)], Lemma 11 and Lemma 12 shows the
following fact: uniformly on all sufficiently large $r$, the process will
"hit" a state satisfying the conditions of Lemma 13, with probability that
can be made arbitrarily close to 1 by choosing a sufficiently large fixed
$C_{11} > 0$.

Now, suppose at some time point within $[0, \epsilon T \log r]$ the process
is in a state satisfying the conditions of Lemma 13. First, we obtain a
bound on $|F^r(\epsilon T \log r)|$. Namely, uniformly on all sufficiently
large $r$, $|F^r(\epsilon T \log r)| < C_{14} r^{1/2}$ with probability that
can be made arbitrarily close to 1 by choosing a sufficiently large fixed
$C_{14} > 0$. This follows from Lemma 13 and Proposition 17. This
establishes the tightness of the sequence of
$\mathrm{Inv}[(\hat\Psi^r_{ij}(\cdot))] = \mathrm{Inv}[r^{-1/2}(\Psi^r_{ij}(\cdot) - \psi^*_{ij} r)]$.
Second, we obtain a bound on
$|(Q^r_i(\epsilon T \log r))| + |(Z^r_j(\epsilon T \log r), j < J)|$. This is
even easier: by (15) in Lemma 14,

$P\{|(Q^r_i(\epsilon T \log r))| + |(Z^r_j(\epsilon T \log r), j < J)| < C_{10} \delta_9 r^{\nu_9}\} \to 1$.

But, since $\nu_9$ can be chosen arbitrarily small, we obtain property (4).

Given the tightness of the sequence of
$\mathrm{Inv}[(\hat\Psi^r_{ij}(\cdot))]$ and property (4), it is
straightforward to show the remaining property (3). (The argument is
essentially the same as that in the proof of [12, Theorem 8.5.1], although
that result does not directly apply to our setting.) Consider the Markov
process $F^r(\cdot)$ in the stationary regime. We fix arbitrary $T > 0$,
$\delta_9 > 0$ and $\nu_9 \in (0, 1/4)$, and then a large enough parameter
$C_{11} > 0$, so that, with probability arbitrarily close to 1, the
conditions on $F^r(0)$ in Lemma 13 are satisfied for all large $r$. We then
pick a sufficiently small fixed $\epsilon > 0$, so that property (13) holds.
Finally, using Proposition 17, we pick a sufficiently large $T' > 0$, so
that $\mathrm{Dist}[(\breve\Psi^r_{ij}(T'))]$ is close to
$\mathrm{Inv}[(\breve\Psi^r_{ij}(\cdot))]$, uniformly on the initial states
$|(\Psi^r_{ij}(0) - \psi^*_{ij} r)| < C_{11} r^{1/2}$. (Here 'close' is just
in the sense of Gaussian distribution parameters, means and covariances; or,
more generally, it can be in the sense of the Prohorov metric [5].) Note
that, for all large $r$, $T' < \epsilon T \log r$. Applying Lemma 13, we see
that, for all large $r$, $\mathrm{Dist}[(\hat\Psi^r_{ij}(T'))]$ is close to
$\mathrm{Dist}[(\breve\Psi^r_{ij}(T'))]$, which in turn is close to
$\mathrm{Inv}[(\breve\Psi^r_{ij}(\cdot))]$; and we can make it arbitrarily
close by rechoosing parameters. This implies (3). We omit further details. □

5. Discussion. As already mentioned in the Introduction, we believe that
the approach developed in [17] and this paper provides a quite generic
scheme for establishing diffusion-scale tightness of invariant
distributions, under the strictly subcritical load $\rho < 1$. The approach
shows that for the diffusion-scale tightness to hold, it is essentially
sufficient to verify the two key stability properties, global stability and
local stability, which we (at a high level and informally) describe next.
Let $F^r(\cdot)$ be a process describing the system state deviation from the
equilibrium point. (For the LAP policy,
$F^r(t) = (\Psi^r_{ij}(t) - \psi^*_{ij} r, Q^r_i(t))$ as defined in this
paper.)

(a) Global stability. The fluid limit $f(t)$, $t \ge 0$, is defined as
$\lim_r r^{-1} F^r(t)$, $t \ge 0$. By global stability we mean the following
property: (a.1) the trajectories $f(t)$ converge to 0, uniformly in the
initial states from a bounded set. Moreover, we also require the following
related property to hold: (a.2) uniformly on all infinite initial states,
$|f(0)| = \infty$, each trajectory $f(t)$ reaches a state where all server
pools are fully occupied, and then stays in such a state forever. (For the
LAP policy, the formal statements are [17, Propositions 13 and 16].)

(b) Local stability. Suppose $h(r)$ is a function of $r$ such that
$h(r)/r \to 0$ and $h(r)/\sqrt{r} \to \infty$. The local fluid limit $f(t)$,
$t \ge 0$, is defined as $\lim_r h(r)^{-1} F^r(t)$, $t \ge 0$. Suppose the
trajectories $f(\cdot)$ satisfy a linear ODE $(d/dt) f(t) = L_2 f(t)$. By
local stability we mean the property that all eigenvalues of $L_2$ have
negative real parts. (For the LAP, the formal statement is
[17, Theorem 23]. For the LQFS-LB policy of [18], the local stability does
not hold.)
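
For a concrete picture of the local stability condition, the following Python sketch checks whether both eigenvalues of a real $2 \times 2$ matrix have negative real parts. The matrices used are hypothetical examples, not the matrix $L_2$ of any particular policy, and the restriction to dimension two is an assumption made to keep the sketch dependency-free.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a real 2x2 matrix ((a, b), (c, d)) via the quadratic formula."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)


def is_locally_stable(m):
    """True if all eigenvalues have strictly negative real parts."""
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(m))


if __name__ == "__main__":
    stable = ((-1.0, 2.0), (0.0, -3.0))      # eigenvalues -1 and -3
    saddle = ((1.0, 0.0), (0.0, -1.0))       # eigenvalues 1 and -1
    center = ((0.0, 1.0), (-1.0, 0.0))       # purely imaginary eigenvalues
    for name, m in (("stable", stable), ("saddle", saddle), ("center", center)):
        print(name, eigenvalues_2x2(m), is_locally_stable(m))
```

Purely imaginary eigenvalues are rejected, since the condition requires strictly negative real parts; without it, the trajectories of the linear ODE need not converge to 0 exponentially fast.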

Properties (a) and (b) may or may not be easy to verify for a given control
policy; but the task of proving or disproving them is typically much easier
than the full task of verifying the diffusion-scale tightness. We also note
that showing local stability may require working with the process under
additional space and/or time scalings, such as hydrodynamic scaling for
LAP (see [17, Section 5.2]).

Given that the global and local stability properties hold, the steps of
establishing diffusion-scale tightness of invariant distributions are as
follows.

Step 1. Existence and $o(r)$-scale tightness of invariant distributions.
Using the global stability property (a.2) and employing the total
(appropriately defined) workload in the system as a Lyapunov function, one
can prove the positive recurrence (stochastic stability) of the process,
and therefore existence of a stationary distribution. The proof is fairly
standard; it uses a Lyapunov function average drift argument, which
additionally shows that $E|r^{-1} F^r|$ is bounded, which in turn implies
the tightness of the distributions of $r^{-1} F^r$. We then employ the global
stability property (a.1) to show that, in fact, the invariant distributions
of $r^{-1} F^r$ asymptotically concentrate at 0. This can be referred to as
$o(r)$-scale tightness. (The formal result for LAP is in [17, Theorem 14].)

Step 2. $r^{1/2+\epsilon}$-scale tightness. Local stability implies
exponentially fast convergence of fluid limit trajectories $f(\cdot)$ to 0.
In particular, for a sufficiently large fixed $T$, the norm
$|f(t+T)| \le \delta |f(t)|$, where $\delta < 1$. We use this, and
probability estimates for deviations of $h(r)^{-1} F^r(t)$ from a
corresponding local fluid limit $f(t)$, to show that if
$|F^r(0)| = h(r) = o(r)$ then with high probability
$|F^r(T)| \le \delta |F^r(0)|$. Now, it takes $O(\log r)$ intervals of length
$T$ for $|F^r|$ to "descend" from $o(r)$ to $r^{1/2+\epsilon}$, and we show
that this does in fact happen with high probability. (So, the key technical
issue here is that we have to do probability estimates not on a finite, but
on an $O(\log r)$-long interval.) This implies $r^{1/2+\epsilon}$-scale
tightness, for any $\epsilon > 0$; namely, the invariant distributions of
$r^{-1/2-\epsilon} F^r$ asymptotically concentrate at 0. (The formal argument
for LAP is in [17, Section 5.2].) Note that this property is weaker than,
for example, $E|r^{-1/2-\epsilon} F^r| \to 0$.
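
The $O(\log r)$ count of $T$-long intervals in Step 2 can be illustrated with a short Python sketch. It rests on an idealization that is an assumption of this illustration only: the initial deviation is taken to be exactly $r^{\alpha}$ with $1/2 + \epsilon < \alpha \le 1$, and it shrinks by exactly the factor $\delta$ in each interval. The function name and parameters are hypothetical.

```python
import math


def intervals_to_descend(r, alpha, eps, delta):
    """Number of T-long intervals needed to go from r**alpha down to r**(1/2 + eps).

    Idealized count: each interval multiplies the deviation by delta < 1, so we
    need the smallest n with delta**n * r**alpha <= r**(1/2 + eps).
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("need delta in (0, 1)")
    if not 0.5 + eps < alpha <= 1.0:
        raise ValueError("need 1/2 + eps < alpha <= 1")
    return math.ceil((alpha - 0.5 - eps) * math.log(r) / math.log(1.0 / delta))


if __name__ == "__main__":
    # Hypothetical values: the count grows linearly in log r.
    for r in (1e6, 1e12, 1e24):
        print(r, intervals_to_descend(r, alpha=0.9, eps=0.1, delta=0.5))
```

Squaring $r$ doubles $\log r$ and hence roughly doubles the count, which is the sense in which the probability estimates have to be carried out over an $O(\log r)$-long time horizon rather than a fixed one.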

Step 3. Diffusion-scale ($r^{1/2}$-scale) tightness. Here we start with the
$r^{1/2+\epsilon}$-scale tightness, with $\epsilon > 0$ being sufficiently
small. We show that if $|F^r(0)| = O(r^{1/2+\epsilon})$, then, with high
probability, $|F^r(t)|$ "hits the diffusion scale" $O(r^{1/2})$ within time
$\epsilon \log r$. Again, this is achieved by considering $O(\log r)$
consecutive $T$-long intervals, in each of which $|F^r|$ must decrease by a
factor with high probability, unless $|F^r(t)|$ does hit $O(r^{1/2})$. (The
formal result for LAP is Lemma 11.) Given that, it remains to show that if
$|F^r(0)| = O(r^{1/2})$ and $\epsilon$ is small enough, then for any
$t \in [0, \epsilon \log r]$, we also have $|F^r(t)| = O(r^{1/2})$ with high
probability; this is done by showing the closeness of the process
$r^{-1/2} F^r(\cdot)$ to the corresponding limiting diffusion process on the
$\epsilon \log r$-long interval, and using the fact that the drift matrix of
the diffusion process is exactly the $L_2$ matrix from the definition of
local stability. (For LAP, this is done in the bulk of this paper, from
Lemma 12 on.) This completes the proof of diffusion-scale tightness. Again,
we note that this property is weaker than, for example, the boundedness of
$E|r^{-1/2} F^r|$. (For LAP, this step involves showing that all queues and
all pool idlenesses, except that of pool $J$ serving the lowest priority
activity, are in fact $o(r^{\nu})$ for any $\nu > 0$.)

In conclusion, we remark that many (although not all) parts of the above
scheme do rely on the strict subcriticality condition $\rho < 1$. It would
be of interest to explore whether the approach can be extended to
establishing diffusion-scale tightness in the Halfin-Whitt regime.